Enhancing the Performance of Semi-Supervised Classification Algorithms with Bridging
Abstract
Traditional supervised classification algorithms require a large number of labelled examples to perform accurately. Semi-supervised classification algorithms attempt to overcome this major limitation by also using unlabelled examples. Unlabelled examples have also been used to improve nearest neighbour text classification in a method called bridging. In this paper, we propose the use of bridging in a semi-supervised setting. We introduce a new bridging algorithm that can be used as a base classifier in any semi-supervised approach such as co-training or self-learning. We show empirically that bridging increases classification performance by improving the semi-supervised algorithm’s ability to correctly assign labels to previously unlabelled data.
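As a rough illustration of where a base classifier sits inside self-learning, the following minimal sketch runs a generic self-learning loop in Python. It is not the bridging algorithm proposed in the paper: `nn_predict` is a plain 1-nearest-neighbour stand-in, and the confidence score, the `per_round` parameter, and all function names are assumptions made for this example only.

```python
import math


def nn_predict(train, x):
    """Return (label, confidence) for x using 1-nearest-neighbour.

    The confidence 1 / (1 + distance) is an illustrative assumption,
    not a score taken from the paper.
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, 1.0 / (1.0 + best_dist)


def self_training(labelled, unlabelled, predict=nn_predict, per_round=1):
    """Generic self-learning loop around an arbitrary base classifier.

    Each round, the base classifier labels every example still in the
    pool; the `per_round` most confident predictions are moved into the
    labelled set, and the loop repeats until the pool is empty.
    """
    labelled = list(labelled)
    pool = list(unlabelled)
    while pool:
        # Score every pooled example with the current labelled set.
        scored = [(predict(labelled, x), x) for x in pool]
        # Most confident predictions first.
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _confidence), x in scored[:per_round]:
            labelled.append((x, label))
        pool = [x for _prediction, x in scored[per_round:]]
    return labelled


if __name__ == "__main__":
    seed = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
    pool = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (8.0, 8.0)]
    for features, label in self_training(seed, pool):
        print(features, label)
```

Any classifier that returns a label and a confidence could be passed as `predict`; the abstract's claim is that a bridging-based classifier in this role assigns labels to the pooled examples more accurately.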
Similar resources
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach that uses random forest as a base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised classification approach uses unlabeled data together with a small amount of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
Determination of Best Supervised Classification Algorithm for Land Use Maps using Satellite Images (Case Study: Baft, Kerman Province, Iran)
Given the fundamental goal of remote sensing technology, the classification of images from the desired sensors can be regarded as the most important part of satellite image interpretation. Various algorithms exist for supervised land use classification, and the most pertinent one should be determined. Therefore, this study has been conducted to determine the best and most su...
Nearest Neighbour Classification with Background Knowledge Extended to Semi-supervised Learning
Semi-supervised methods involve converting unlabelled data into high-quality labelled data that can be used to improve the performance of conventional supervised methods that had previously been given a small training set. Unlabelled data has also been shown to be helpful in a supervised setting called ‘bridging’ where unlabelled data have been used to help relate labelled instances to those th...
Classification of encrypted traffic for applications based on statistical features
Traffic classification plays an important role in many aspects of network management, such as identifying the type of transferred data, detecting malware applications, and applying policies to restrict network access. Basic methods in this field used obvious traffic features such as port number and protocol type to classify the traffic type. However, recent changes in applicat...
Using semi-supervised classifiers for credit scoring
In credit scoring, low-default portfolios are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord), and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitabi...